SPACEc: Cell Segmentation¶

Cell segmentation identifies each cell in an image so that single-cell data can be derived from it. SPACEc provides two commonly used approaches: DeepCell Mesmer and Cellpose. Mesmer is a deep learning-based segmentation algorithm that works out of the box for most multiplexed images. Cellpose is also deep learning-based, but it provides several models that can be used directly and lets users easily train their own models.

For the purpose of this tutorial we will use Mesmer.

The steps of the script are:

  1. Deciding on the image channels for segmentation
  2. Running segmentation
  3. Checking the quality of the segmented images
  4. Storing the data for further processing
In [1]:
# import spacec first
import spacec as sp

#import standard packages
import os
import warnings
import matplotlib
import pickle
warnings.filterwarnings('ignore')

# set the default color map to viridis; the parameter below can be changed
matplotlib.rcParams["image.cmap"] = 'viridis'
/miniforge/envs/spacec/lib/python3.10/site-packages/louvain/__init__.py:54: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import get_distribution, DistributionNotFound
2026-01-31 19:42:34.275531: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /.singularity.d/libs
2026-01-31 19:42:34.275577: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
In [2]:
# Your absolute path
root_path = "/home/jiawu2/SPACEc_image"

# Let Python handle the slash
data_path = os.path.join(root_path, 'data/') 
output_dir = os.path.join(root_path, 'results/')

os.makedirs(output_dir, exist_ok=True)

# Print to verify
print("Data is at:", data_path)
Data is at: /home/jiawu2/SPACEc_image/data/

If you want to use GPU acceleration for segmentation, make sure that CUDA is installed and that the GPU is detected by the Python environment. If you have a compatible NVIDIA card but have never installed the CUDA Toolkit, you can find installation instructions here: https://developer.nvidia.com/cuda-downloads

In [ ]:
# check GPU availability
!nvcc --version
!nvidia-smi
In [ ]:
sp.hf.check_for_gpu()
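
If you prefer a check that does not abort the notebook when the NVIDIA tools are missing, a minimal sketch using only the standard library could look like this. The helper name `nvidia_smi_available` is hypothetical and not part of SPACEc; it only tells you whether the `nvidia-smi` executable can be found and runs cleanly, not whether TensorFlow can actually use the GPU.

```python
import shutil
import subprocess


def nvidia_smi_available() -> bool:
    """Return True if nvidia-smi is on PATH and exits with status 0."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The driver tools are not installed or not on PATH.
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("nvidia-smi usable:", nvidia_smi_available())
```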

Cell segmentation¶

NOTE: Our segmentation function has a parameter called 'input_format' that defines which input data the function accepts. If set to 'Multichannel', the function expects a single multichannel TIFF file; if set to 'Channels', it expects a folder of single-channel TIFF files (no channelnames.txt required); and if set to 'CODEX', it reads the output of the classic first-generation CODEX setup.

Before committing to a potentially time-intensive segmentation run, it can be useful to visualize the segmentation channels. This tutorial provides both nuclei and membrane channels. Combining several membrane markers, as shown below, is especially useful when no general membrane marker is available.

In [4]:
# Optional: nuclei alone can also be used for segmentation.
# Visualize the membrane channels to use for cell segmentation.

sp.pl.segmentation_ch(
    file_name = output_dir + 'reg001_X01_Y01_Z01.tif', # image for segmentation
    channel_file = data_path + 'channelnames.txt', # all channels used for staining
    output_dir = output_dir,
    extra_seg_ch_list = ["CD3", "CD34"], # default is None; if more than one channel is provided, they are combined
    nuclei_channel = 'DAPI', # channel to use for nuclei segmentation
    input_format = 'Multichannel',
)
Formatting Multichannel image (59 channels)...
Formatted 59 Multichannel channels.
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.
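
The log line above reports that the selected markers are combined by max projection. As an illustration of that idea only (this is a NumPy sketch, not SPACEc's internal code, and `max_project` is a hypothetical helper), each output pixel takes the largest value found at that position across the input channels:

```python
import numpy as np


def max_project(channels):
    """Combine equally shaped 2D channel images by per-pixel maximum."""
    stack = np.stack(channels, axis=0)  # shape: (n_channels, height, width)
    return stack.max(axis=0)


# Tiny made-up intensity images standing in for two membrane markers.
cd3 = np.array([[0, 5], [2, 1]], dtype=np.uint16)
cd34 = np.array([[3, 1], [2, 7]], dtype=np.uint16)

combined = max_project([cd3, cd34])
print(combined)
```

A max projection keeps the strongest membrane signal at every pixel, so a cell that is bright in only one of the markers still gets a visible outline.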

After deciding on the channels, run segmentation with the 'cell_segmentation' function. Besides the channels, the function also lets you select the segmentation model. It expects a multichannel TIFF file and a channel names file as input (see our example data). The extracted features are stored as a CSV file.
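
Before starting a long run, it can help to confirm that the channels you plan to pass are actually listed in the channel names file. The sketch below assumes the file holds one channel name per line, which is an assumption about the example data rather than something stated here; `read_channel_names` and `missing_channels` are hypothetical helpers, not SPACEc functions.

```python
import tempfile
from pathlib import Path


def read_channel_names(path):
    """Read one channel name per non-empty line (assumed file layout)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_channels(names, required):
    """Return the required channel names that are absent from names."""
    available = set(names)
    return [ch for ch in required if ch not in available]


# Demo with a throwaway file; point this at your own channelnames.txt instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "channelnames.txt"
    demo_file.write_text("DAPI\nCD3\nCD34\n", encoding="utf-8")
    names = read_channel_names(demo_file)

print("missing:", missing_channels(names, ["DAPI", "CD3", "CD34", "CD45"]))
```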

In [9]:
# choose between cellpose or mesmer for segmentation
# first image
# seg_output contains {'img': img, 'image_dict': image_dict, 'masks': masks}
seg_output1 = sp.tl.cell_segmentation(
    file_name = output_dir + 'reg001_X01_Y01_Z01.tif',
    channel_file = data_path + 'channelnames.txt',
    output_dir = output_dir,
    seg_method ='mesmer', # cellpose or mesmer
    nuclei_channel = 'DAPI',
    output_fname = 'tonsil1',
    membrane_channel_list = ["CD3", "CD34"], # default is None; if more than one channel is provided, they are combined
    compartment = 'whole-cell', # mesmer: segment whole cells or nuclei only
    input_format = 'Multichannel', # 'Multichannel', 'Channels', or 'CODEX'
    resize_factor = 1, # default is 1; lower the value if the image is too large. Lower values speed up segmentation but may reduce accuracy.
    size_cutoff = 0) # objects with an area below this many pixels are filtered out
--- Initializing Segmentation Pipeline ---
No GPU detected by TensorFlow.
Output basename: tonsil1
Segmentation method: mesmer
Differentiate Nucleus/Cytoplasm: False
--- Loading Image Data (Format: Multichannel) ---
Loading image: /home/jiawu2/SPACEc_image/results/reg001_X01_Y01_Z01.tif
Loaded image shape: (59, 3122, 2866)
Loaded 59 channel names from: /home/jiawu2/SPACEc_image/data/channelnames.txt
Formatting Multichannel image (59 channels)...
Formatted 59 Multichannel channels.
Image dictionary created with 59 channels.

--- Preparing Segmentation Inputs ---
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.
Segmentation dictionary prepared with channels: ['DAPI', 'segmentation_channel']
Resize factor is 1, skipping image resizing.
Image shape for segmentation: (3122, 2866)

--- Processing Full Image for Segmentation ---

--- Segmentation Mode: Standard ---
Processing full image...
Running Mesmer segmentation: compartment='whole-cell', mpp=0.5
Found existing Mesmer model at: models/Mesmer_model/MultiplexSegmentation
Loading Mesmer model...
Mesmer model loaded successfully.
Predicting with Mesmer...
Extracted Mesmer mask with shape: (3122, 2866), max label: 28749
Resizing final mask to original image shape...

--- Extracting Features ---
Quantifying features for segmented objects...
--- Starting Feature Extraction ---
Calculating morphological features...
Calculated initial morphology for 28749 objects.
Filtering objects with area < 0 pixels...
Found 28749 objects after size filtering.
Creating filtered mask for intensity calculation...
Calculating mean intensities...
Processing intensities on full images.
Processing channels: 100%|██████████████████████████████████████████████████████████████| 59/59 [00:25<00:00,  2.28it/s]
Combining morphology and intensity features...
Successfully saved features for 28749 objects to /home/jiawu2/SPACEc_image/results/tonsil1_features.csv
--- Feature Extraction Complete ---
Saved features to /home/jiawu2/SPACEc_image/results/tonsil1_features.csv

--- Segmentation Pipeline Complete ---
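
To confirm that the saved features file matches the object count reported in the log, you could count its rows. This is a minimal sketch that assumes the CSV has a header row and one row per segmented object; the column names in the demo are invented for illustration and `count_feature_rows` is a hypothetical helper.

```python
import csv
import tempfile
from pathlib import Path


def count_feature_rows(csv_path):
    """Return (number of data rows, list of column names) for a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        n_rows = sum(1 for _ in reader)
        columns = list(reader.fieldnames or [])
    return n_rows, columns


# Demo on a tiny stand-in file; use your own <output_fname>_features.csv instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = Path(tmp) / "demo_features.csv"
    demo_csv.write_text("label,area,DAPI\n1,120,0.8\n2,95,0.6\n", encoding="utf-8")
    n_rows, columns = count_feature_rows(demo_csv)

print(n_rows, columns)
```
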
In [10]:
# second image
# use the same method as for the first image so that the results are comparable
# seg_output contains {'img': img, 'image_dict': image_dict, 'masks': masks}
seg_output2 = sp.tl.cell_segmentation(
    file_name = output_dir + 'reg002_X01_Y01_Z01.tif',
    channel_file = data_path + 'channelnames.txt',
    output_dir = output_dir,
    output_fname = 'tonsil2',
    seg_method ='mesmer', # cellpose or mesmer
    nuclei_channel = 'DAPI',
    membrane_channel_list = ["CD3", "CD34"], # default is None; if more than one channel is provided, they are combined
    input_format = 'Multichannel', # 'Multichannel', 'Channels', or 'CODEX'
    compartment = 'whole-cell', # mesmer: segment whole cells or nuclei only
    resize_factor = 1, # default is 1; lower the value if the image is too large. Lower values speed up segmentation but may reduce accuracy.
    size_cutoff = 0) # objects with an area below this many pixels are filtered out
--- Initializing Segmentation Pipeline ---
No GPU detected by TensorFlow.
Output basename: tonsil2
Segmentation method: mesmer
Differentiate Nucleus/Cytoplasm: False
--- Loading Image Data (Format: Multichannel) ---
Loading image: /home/jiawu2/SPACEc_image/results/reg002_X01_Y01_Z01.tif
Loaded image shape: (59, 2546, 2660)
Loaded 59 channel names from: /home/jiawu2/SPACEc_image/data/channelnames.txt
Formatting Multichannel image (59 channels)...
Formatted 59 Multichannel channels.
Image dictionary created with 59 channels.

--- Preparing Segmentation Inputs ---
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.
Segmentation dictionary prepared with channels: ['DAPI', 'segmentation_channel']
Resize factor is 1, skipping image resizing.
Image shape for segmentation: (2546, 2660)

--- Processing Full Image for Segmentation ---

--- Segmentation Mode: Standard ---
Processing full image...
Running Mesmer segmentation: compartment='whole-cell', mpp=0.5
Found existing Mesmer model at: models/Mesmer_model/MultiplexSegmentation
Loading Mesmer model...
Mesmer model loaded successfully.
Predicting with Mesmer...
Extracted Mesmer mask with shape: (2546, 2660), max label: 23048
Resizing final mask to original image shape...

--- Extracting Features ---
Quantifying features for segmented objects...
--- Starting Feature Extraction ---
Calculating morphological features...
Calculated initial morphology for 23048 objects.
Filtering objects with area < 0 pixels...
Found 23048 objects after size filtering.
Creating filtered mask for intensity calculation...
Calculating mean intensities...
Processing intensities on full images.
Processing channels: 100%|██████████████████████████████████████████████████████████████| 59/59 [00:21<00:00,  2.81it/s]
Combining morphology and intensity features...
Successfully saved features for 23048 objects to /home/jiawu2/SPACEc_image/results/tonsil2_features.csv
--- Feature Extraction Complete ---
Saved features to /home/jiawu2/SPACEc_image/results/tonsil2_features.csv

--- Segmentation Pipeline Complete ---
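
As a rough plausibility check, the object counts and image shapes printed in the two logs above can be turned into an object density. The sketch below reuses `mpp=0.5` (microns per pixel) from the Mesmer log line; whether that matches the true pixel size of your images is an assumption you should verify. The density covers the whole image, including any empty background, and `objects_per_mm2` is a hypothetical helper.

```python
def objects_per_mm2(n_objects, height_px, width_px, mpp):
    """Objects per square millimetre for an image of the given pixel size."""
    area_um2 = height_px * width_px * mpp ** 2  # image area in square microns
    area_mm2 = area_um2 / 1e6                   # 1 mm^2 = 1e6 um^2
    return n_objects / area_mm2


# Counts and shapes copied from the segmentation logs above.
density1 = objects_per_mm2(28749, 3122, 2866, mpp=0.5)
density2 = objects_per_mm2(23048, 2546, 2660, mpp=0.5)

print("tonsil1:", round(density1), "objects per mm^2")
print("tonsil2:", round(density2), "objects per mm^2")
```

Densities of similar magnitude for two regions of the same tissue are what you would hope to see; a large gap may be worth a closer look at the masks.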

Besides the Mesmer segmentation used in our example, you can use Cellpose, as shown in the example below.

In [11]:
seg_output_cellpose = sp.tl.cell_segmentation(
    file_name = output_dir + 'reg002_X01_Y01_Z01.tif',
    channel_file = data_path + 'channelnames.txt',
    output_dir = output_dir,
    output_fname = 'tonsil2',
    seg_method ='cellpose', # cellpose or mesmer
    model='cyto3', # cellpose model
    diameter=28, # average cell diameter (in pixels). If set to None, it will be automatically estimated.
    nuclei_channel = 'DAPI',
    membrane_channel_list = ["CD3", "CD34"], # default is None; if more than one channel is provided, they are combined
    input_format = 'Multichannel', # 'Multichannel', 'Channels', or 'CODEX'
    resize_factor = 1, # default is 1; lower the value if the image is too large. Lower values speed up segmentation but may reduce accuracy.
    size_cutoff = 0) # objects with an area below this many pixels are filtered out
--- Initializing Segmentation Pipeline ---
No GPU detected by TensorFlow.
Output basename: tonsil2
Segmentation method: cellpose
Differentiate Nucleus/Cytoplasm: False
--- Loading Image Data (Format: Multichannel) ---
Loading image: /home/jiawu2/SPACEc_image/results/reg002_X01_Y01_Z01.tif
Loaded image shape: (59, 2546, 2660)
Loaded 59 channel names from: /home/jiawu2/SPACEc_image/data/channelnames.txt
Formatting Multichannel image (59 channels)...
Formatted 59 Multichannel channels.
Image dictionary created with 59 channels.

--- Preparing Segmentation Inputs ---
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.
Segmentation dictionary prepared with channels: ['DAPI', 'segmentation_channel']
Resize factor is 1, skipping image resizing.
Image shape for segmentation: (2546, 2660)

--- Processing Full Image for Segmentation ---

--- Segmentation Mode: Standard ---
Processing full image...
Using Membrane (Red=1) and Nucleus (Blue=3) channels for Cellpose.
Running Cellpose: model='cyto3', custom=False, diameter=28, channels=[1, 3], gpu=False
Error initializing Cellpose model 'cyto3': name 'cellpose_models' is not defined
Error during Cellpose segmentation run: name 'cellpose_models' is not defined
Warning: cellpose returned None.
An error occurred during the segmentation stage: Full image segmentation failed.
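
The `name 'cellpose_models' is not defined` error in the log above is consistent with Cellpose not being importable in this environment, although the log itself does not state the cause. A quick, hedged way to test that hypothesis before rerunning is to ask Python whether it can locate the package. `module_available` is a hypothetical helper, not a SPACEc function.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


print("cellpose importable:", module_available("cellpose"))
```

If this prints `False`, installing Cellpose into the active environment and restarting the kernel is the first thing to try.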

You can also load a custom fine-tuned Cellpose model into SPACEc, as shown below. The required input is the zip file that Cellpose outputs after training.

In [ ]:
# OPTIONAL: requires a Cellpose model that you have trained yourself
seg_output_cellpose = sp.tl.cell_segmentation(
    file_name = output_dir + 'reg002_X01_Y01_Z01.tif',
    channel_file = data_path + 'channelnames.txt',
    output_dir = output_dir,
    output_fname = 'tonsil2',
    seg_method ='cellpose', # cellpose or mesmer
    model='/home/user/path_to_custom_model/models/CP_XXXX_XXXX', # cellpose model
    diameter=28, # average cell diameter (in pixels). If set to None, it will be automatically estimated.
    nuclei_channel = 'DAPI',
    membrane_channel_list = ["CD45", "betaCatenin"], # default is None; if more than one channel is provided, they are combined
    input_format = 'Multichannel', # 'Multichannel', 'Channels', or 'CODEX'
    resize_factor = 1, # default is 1; lower the value if the image is too large. Lower values speed up segmentation but may reduce accuracy.
    size_cutoff = 0,
    custom_model=True) 

Visualizing the segmentation result¶

Not every dataset works equally well with every segmentation model because tissue type, structure, and image quality differ. It is therefore important to check the segmentation results before continuing with the data analysis. The 'show_masks' function selects random tiles of a user-defined size from the image to provide examples for evaluating the segmentation quality. If the quality is not acceptable, try a different model. For especially challenging datasets, users can also retrain a model specifically for their images.
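
To make the idea of seeded random tiles concrete, the sketch below draws reproducible top-left corners for square tiles that fit inside an image. It illustrates the sampling concept only and is not how 'show_masks' is implemented; `random_tile_origins` is a hypothetical helper.

```python
import random


def random_tile_origins(height, width, tilesize, n, seed):
    """Return n (row, col) top-left corners of tiles that fit in the image."""
    if tilesize > height or tilesize > width:
        raise ValueError("tilesize must not exceed the image dimensions")
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return [
        (rng.randint(0, height - tilesize), rng.randint(0, width - tilesize))
        for _ in range(n)
    ]


# Image shape taken from the first segmentation log above.
origins = random_tile_origins(3122, 2866, tilesize=300, n=2, seed=1)
print(origins)
```

Fixing the seed lets you inspect the same tiles again after changing the model or its parameters, which makes before-and-after comparisons fair.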

In [12]:
overlay_data1, rgb_images1 = sp.pl.show_masks(
    seg_output=seg_output1,
    nucleus_channel = 'DAPI', # channel used for nuclei segmentation (displayed in blue)
    additional_channels = ["CD3", "CD34"], # additional channels to display (displayed in green - channels will be combined into one image)
    show_subsample = True, # show a random subsample of the image
    n=2, # number of subsamples; needs to be at least 2
    tilesize = 300, # size of each tile
    rand_seed = 1)
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.
In [14]:
overlay_data2, rgb_images2 = sp.pl.show_masks(
    seg_output=seg_output2,
    nucleus_channel = 'DAPI', # channel used for nuclei segmentation (displayed in blue)
    additional_channels = ["CD3", "CD34"], # additional channels to display (displayed in green - channels will be combined into one image)
    show_subsample = True, # show a random subsample of the image
    n=2, # number of subsamples; needs to be at least 2
    tilesize = 300, # size of each tile
    rand_seed = 3) 
Combining channels ['CD3', 'CD34'] into 'segmentation_channel' using max projection.

Save the segmentation result¶

After successful segmentation, the images and masks can be stored in pickle files for easy access later.

In [15]:
#Save segmentation output
with open(output_dir + 'seg_output_tonsil1.pickle', 'wb') as f:
    pickle.dump(seg_output1, f)

with open(output_dir + 'seg_output_tonsil2.pickle', 'wb') as f:
    pickle.dump(seg_output2, f)
    
#Save the overlay of the data
with open(output_dir + 'overlay_tonsil1.pickle', 'wb') as f:
    pickle.dump(overlay_data1, f)

with open(output_dir + 'overlay_tonsil2.pickle', 'wb') as f:
    pickle.dump(overlay_data2, f)
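
To read a saved result back in a later session, the matching call is `pickle.load`. The sketch below shows a full round trip on a small stand-in dictionary with the same keys as `seg_output` (`img`, `image_dict`, `masks`); the values are placeholders and `load_pickle` is a hypothetical helper. Only unpickle files you created yourself or otherwise trust, because loading a pickle can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load and return the object stored in a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip on a small stand-in for seg_output.
demo = {"img": None, "image_dict": {}, "masks": [[0, 1], [1, 0]]}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "seg_output_demo.pickle"
    with open(demo_path, "wb") as f:
        pickle.dump(demo, f)
    restored = load_pickle(demo_path)

print(sorted(restored))
```

In your own session you would call `load_pickle(output_dir + 'seg_output_tonsil1.pickle')` instead of using the temporary file.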